Benchmark Molecular Haplotyping by Pyrosequencing

Authors

  • Jacob Odeberg
  • Kristina Holmberg
  • Per Eriksson
  • Mathias Uhlén
Abstract

Several million SNPs have been identified and are available in public databases. For most, no effect on gene function or expression is known. The term “functional polymorphisms” refers to SNPs to which a putative effect on gene function or expression is attributed, and implies that they are “non-silent” variations located in coding regions, regulatory regions, splice sites, or untranslated regions that affect mRNA stability and translation. In association studies (12), a polymorphism state that correlates with the occurrence of disease suggests that the etiological agent is either in linkage with the analyzed polymorphic site or identical to it. Haplotypes, the specific combinations of nucleotides located on the same chromosomal molecule (allele), are likely to be more informative about the complex relationship between phenotype (disease) and DNA variation than any single SNP. The disease association of a specific allele may depend on cis effects involving functional variants at other loci within the gene, and an association may not be found unless a sufficient marker for the haplotype on which a cis effect arises is typed. Regardless of possible cis effects, an analyzed SNP may have no effect on gene function itself and act only as a marker for a yet uncharacterized functional variation within the same genomic region. If the functional variant(s) is of more recent origin, the polymorphism state at the analyzed (non-functional) marker may be shared between the allele harboring the functional variant(s) and one that does not. In such a case, genotype analysis of the marker polymorphism could fail to detect a significant disease association that exists for one of the underlying alleles. Studies of genomic variation in extended chromosomal regions and across the entire chromosome 21 (2,11) have revealed a picture of discrete haplotype blocks spanning up to 100 kb.
Remarkably, the diversity is very limited, with a few haplotypes (two to four) accounting for more than 90% of the chromosomes/alleles present in a population sample (2,11). This implies that even if a large number of polymorphic sites exist in such haplotype blocks, the corresponding haplotypes can be identified using a small number of haplotype “tags” (6), which do not necessarily have to be spread out over a haplotype block. These haplotype tags can thus be a small subset of SNPs that uniquely distinguishes the common haplotypes in each block and allows exhaustive testing of whether common variation within a longer genomic region is associated with disease, regardless of whether the functional variant(s) themselves are genotyped. Current methods for analyzing haplotypes include statistical estimation from population or family genotype data (3,13,15). Direct molecular typing methods such as allele-specific PCR, with match/mismatch 3′-end primers for two SNPs located some distance apart, have been described (10), but the robustness of this approach is likely to be sequence dependent. For example, GT or CA mismatches are clearly not refractory to extension in PCR (5,8). We demonstrate here how the Pyrosequencing technique (Pyrosequencing AB, Uppsala, Sweden) (14) can be applied to molecular haplotyping, using the cathepsin S (CTSS) and matrix metalloproteinase 7 (MMP7) genes as examples. Inherent to the sequencing-by-synthesis principle on which the Pyrosequencing technique is based is the possibility of designing a nucleotide addition order such that extension over a polymorphic site becomes out of phase on the two alleles, so that extension on one strand lags behind when passing over and continuing beyond the polymorphism. This results in distinctive raw data profiles that distinguish between genotypes without ambiguity.
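The out-of-phase idea can be illustrated with a minimal, idealized simulation. The sketch below is not the authors' software or assay design: the function name `pyrogram`, the two allele sequences, and the dispensation order are all hypothetical, chosen only to show the principle. It models each allele by the strand to be synthesized, assumes complete incorporation with no noise, and sums the number of nucleotides incorporated per dispensation over both alleles.

```python
def pyrogram(alleles, dispensation):
    """Return the idealized signal for each dispensed nucleotide.

    alleles: strands to be synthesized, one string per chromosome copy.
    dispensation: the order in which nucleotides are added.
    The signal is the number of nucleotides incorporated, summed over alleles.
    """
    positions = [0] * len(alleles)  # current 3'-end position on each allele
    signals = []
    for nucleotide in dispensation:
        incorporated = 0
        for i, strand in enumerate(alleles):
            pos = positions[i]
            # Extend through a homopolymer run if the dispensed base matches.
            while pos < len(strand) and strand[pos] == nucleotide:
                pos += 1
                incorporated += 1
            positions[i] = pos
        signals.append(incorporated)
    return signals


# Hypothetical sequences with a C/T polymorphism at the second position.
ALLELE_C = "ACGTA"
ALLELE_T = "ATGTA"

# Hypothetical addition order: C is dispensed before T, so the T allele
# waits at the SNP and then lags behind the C allele for the rest of the run.
DISPENSATION = "ACGTAGTA"

profiles = {
    "C/C": pyrogram([ALLELE_C, ALLELE_C], DISPENSATION),
    "C/T": pyrogram([ALLELE_C, ALLELE_T], DISPENSATION),
    "T/T": pyrogram([ALLELE_T, ALLELE_T], DISPENSATION),
}

for genotype, signal in profiles.items():
    print(genotype, signal)
```

Under these assumptions the three genotypes already differ at the second dispensation (two, one, or zero incorporations) and keep differing downstream, because the lagging allele needs extra dispensations to catch up.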
By keeping the extension out of phase up to and beyond a second polymorphism, the raw data profile obtained at that site depends on which nucleotide variant is present at the first polymorphic position on the same allele/chromosomal molecule. To illustrate the principle of out-of-phase extension, a haplotype marker in the CTSS gene, consisting of two polymorphic sites located 3 bp apart in the proximal promoter, is shown in Figure 1, where arrows indicate the position of the 3′-ends of the extended strands on the alleles after the fifth and eleventh nucleotide ...
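A small idealized simulation can show why staying out of phase across a second polymorphism reveals the haplotype. This is a sketch under stated assumptions, not the CTSS assay from Figure 1: the flanking sequence, the SNP alleles (C/T at the first site, G/A at the second, three positions apart), the two dispensation orders, and the helper names `pyrogram` and `haplotype` are all hypothetical. Incorporation is assumed to be complete and noise free. The example compares a double heterozygote in the two possible phases.

```python
def pyrogram(alleles, dispensation):
    """Idealized signal per dispensation, summed over the given alleles."""
    positions = [0] * len(alleles)  # current 3'-end position on each allele
    signals = []
    for nucleotide in dispensation:
        incorporated = 0
        for i, strand in enumerate(alleles):
            pos = positions[i]
            # Extend while the next base to synthesize matches the dispensed one.
            while pos < len(strand) and strand[pos] == nucleotide:
                pos += 1
                incorporated += 1
            positions[i] = pos
        signals.append(incorporated)
    return signals


def haplotype(snp1, snp2):
    """Hypothetical strand to synthesize: SNP1 at index 1, SNP2 at index 4."""
    return "G" + snp1 + "AC" + snp2 + "T"


# Double heterozygote (C/T at SNP1, G/A at SNP2) in its two possible phases.
cis = [haplotype("C", "G"), haplotype("T", "A")]
trans = [haplotype("C", "A"), haplotype("T", "G")]

# In-phase order: both SNP1 variants are dispensed before the shared bases,
# so the two strands are level again when they reach SNP2.
IN_PHASE = "GCTACGAT"

# Out-of-phase order: T is withheld, so the SNP1-T strand waits while the
# SNP1-C strand is read through SNP2 on its own.
OUT_OF_PHASE = "GCACGATACGAT"

print("in phase     cis  ", pyrogram(cis, IN_PHASE))
print("in phase     trans", pyrogram(trans, IN_PHASE))
print("out of phase cis  ", pyrogram(cis, OUT_OF_PHASE))
print("out of phase trans", pyrogram(trans, OUT_OF_PHASE))
```

With the in-phase order the two phases give identical profiles, since only the genotype at each site is read. With the out-of-phase order, the fifth and sixth dispensations (G, then A) report the second-site variant carried by the SNP1-C strand alone, so the two profiles differ.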

Publication date: 2002